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[57] ABSTRACT 

A method of temporally adaptive filtering of the frames of an 
image sequence is disclosed. To filter the kth frame, first, the 
motion trajectories that transverse the pixel locations at the 
kth frame are determined using a motion estimation algo- 
rithm. Motion trajectories pass through a r^edetermined 
number of frames neighboring the kth frame. Image values 
at the neighboring frames, along motion trajectories are then 
evaluated. Temporally adaptive linear minimum mean 
square error (LMMSE) estimates of the image values at the 
pixel locations of the kth frame are determined using the 
image values along the motion trajectories. These steps are 
repeated for each of the other frames of the image sequence. 

18 Claims, 6 Drawing Sheets 
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METHOD FOR TEMPORALLY ADAPTIVE 
FILTERING OF FRAMES OF A NOISY 
IMAGE SEQUENCE USING MOTION 
ESTIMATION 



FIELD OF THE INVENTION 

The present invention relates to noise reduction in digital 
image sequences that can be obtained from various video 10 
sources as well as motion picture film. 

BACKGROUND OF THE INVENTION 

Motion estimation, being important in its own right in the 15 
broad area of computer vision, is of utmost importance in 
almost every aspect of image sequence processing. Inter- 
frame motion information allows the development of algo- 
rithms that take advantage of naturally existing redundancies 
among the frames of an image sequence. The difficulties 
inherent to the motion estimation problem, and developing 
novel strategies for utilizing the interframe motion informa- 
tion in the context of various processing tasks, pose chal- 
lenges in the field of image sequence processing. 

Image sequence processing is concerned with problems 
such as interframe motion estimation, temporal frame inter- 25 
polation, noise filtering, restoration and data compression. 
The present invention is concerned with two of these prob- 
lems in particular: motion estimation and noise filtering. 

The importance of reducing the noise in image sequences 
is growing with the increasing use of video and television 30 
systems in numerous scientific, commercial and consumer- 
oriented applications. A human observer can potentially 
obtain more information from an image sequence when the 
noise is reduced. In cases where the noise is not visually 
perceivable, reduction of noise increases the efficiency of 35 
subsequent processing that may be applied to the image 
sequence, such as data compression. 

There are two major temporal-domain approaches to 
image sequence filtering: (1) the motion-compensation 
approach, and (2) the motion-detection approach. In motion- 40 
compensated filtering, first a motion estimation algorithm is 
applied to the noisy image sequence to estimate the motion 
trajectories^ i.e., locations of pixels (or subpixels) that cor- 
respond to each other at a predetermined number of con- 
tiguous image frames. Then, the value of a particular pixel 45 
at a certain frame is estimated using the image sequence 
values that are on the motion trajectory passing through that 
pixel. The estimation is performed using either an infinite 
impulse response (HR) or a finite impulse response (FIR) 
filter structure. 

50 

In contrast, methods based on motion detection do not 
attempt to estimate the interframe motion. Instead, direct 
differences of pixel values at identical spatial locations of 
two adjacent frames are computed to detect the presence of 
interframe motion. An estimate of the pixel value at a certain 
location of the present frame is determined by applying an 55 
FIR or an HR filter structure to pixels at identical spatial 
locations of a predetermined number of past and/or future 
frames. The filter coefficients are functions of the "motion- 
detection signal" which is defined as the difference between 
the pixel value of interest at the present frame and the pixel 60 
value at the same location of the previous frame. Certain IIR 
filter structures for temporal filtering on the basis of motion 
detection have been proposed in the prior art, as well as a 
variety of other motion-detection based filtering methods. 

Generally speaking, the performance of these two 65 
approaches is determined by the filter structure, dependence 
of the filter structure to the motion-detection signal (in case 



2 

of the motion-detection approach), and the performance of 
the motion estimation algorithm (in case of the motion- 
compensation approach). Motion-compensated filtering 
methods tend to be more complex due to interframe motion 
estimation. On the other hand, they are potentially more 
effective than those based on motion detection because they 
make use of the interframe motion information. In practice, 
however, the success of a motion-compensated method is 
strongly dependent on the success of motion estimation. 

In an ideal setting, where the scene contents remain 
unchanged from one frame to another and the motion 
estimation algorithm is not affected by noise, direct averag- 
ing of image values over motion trajectories provides effec- 
tive noise reduction. In fact, under independent white Gaus- 
sian noise assumption, the average is a maximum likelihood 
estimate of the pixel value. In practice, however, scene 
contents change from one frame to another, e.g., due to 
camera panning and existence of covered/uncovered 
regions. As a result, image values over an estimated motion 
trajectory may not necessarily correspond to the same image 
structure and direct averaging may result in oversmoothing 
of image details. Therefore, the noise filtering algorithm 
should be temporally adaptive. In one extreme, when the 
motion estimation is accurate, it should approach direct 
averaging. In the other extreme, when the motion estimation 
is inaccurate, it should not perform any filtering. Indeed, the 
motion estimation method should be able to provide good 
estimates in the presence of noise as well as in the case of 
varying scenes in order to allow for effective noise reduc- 
tion. 

The adaptivity requirement outlined above is satisfied 
when the local linear minimum mean square error 
("LMMSE") point estimator that has been derived by Kuan, 
et al. and by Lee is applied in the temporal direction along 
the motion trajectories (See: D. T. Kuan et al., "Adaptive 
Noise Smoothing Filter for Images with Signal-Dependent 
Noise", IEEE Trans. Pattern Anal. Machine IntelL, PAMI-7, 
pp. 165-177, March 1985; and Lee, "Digital Image 
Enhancement and Noise Filtering by Use of Local Statis- 
tics", IEEE Trans Pattern Anal Machine Intel]., PAMI-2, pp. 
165-168. March 1980.) 

It was suggested by Martinez et al. ('Implicit Motion 
Compensated Noise Reduction of Motion Video Scenes", 
Proc. ICASSP, pp. 375-378, Tampa, Fla. 1985) to apply the 
adaptive LMMSE point estimator in the temporal direction. 
Due to the lack of motion estimators that are robust in the 
presence of noise, however, Martinez et al. used a cascade 
of five LMMSE estimators over a set of five hypothesized 
motion trajectories for each pixel, without estimating the 
actual motion. This approach can be regarded as a motion- 
detection approach rather than a motion-compensated 
approach since interframe motion is not estimated. Motion 
detection along a hypothesized trajectory is implicit in the 
adaptive nature of the estimator. Due to the adaptive nature 
of the estimator, filtering is effective only along the trajec- 
tory that is close to the actual one. This approach has been 
reported to be successful in cases where the hypothesized 
motion trajectories are close to the actual ones. 

SUMMARY OF THE INVENTION 

There is a need for a method of reducing noise in image 
sequences that uses the local LMMSE point estimator over 
motion trajectories that are explicitly estimated using a 
robust motion estimation algorithm. 

This and other objects are achieved by the present inven- 
tion which provides a method of temporally adaptive filter- 
ing of the frames of an image sequence. To filter the kth 
frame, first, the motion trajectories that pass through the 
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pixel locations at the kth frame arc determined using a 
motion estimation algorithm. Motion trajectories pass 
through a predetermined number of frames neighboring the 
kth frame clarifying definition of motion trajectory. The 
motion trajectory passing through a pixel location in the kth 
fame is denned as the set of spatial locations in the neigh- 
boring frames that correspond to that pixel in the sense of 
motion, lb determine the motion trajectory passing through 
a certain pixel location, one should estimate the displace- 
ment of the image value at that pixel location in the 
neighboring frames using a motion estimation algorithm 
The displaced locations in the neighboring frames from the 
motion trajectory. Temporally adaptive linear minimum 
mean square error (LMMSE) estimates of the image values 
at the pixel locations of the kth frame are determined using 
the image values along the motion trajectories. Since dis- 
placements can in general be at subpixel levels, a motion 
trajectory may pass through subpixel locations in the frames 
neighboring the kth frame. Therefore, image values should 
be determined at subpixel locations prior to filtering. Image 
values at subpixel locations are determined using one of the 
well-known spatial interpolation techniques, such as bilinear 
interpolation. These steps are repeated for each of the other 
frames of the image sequence. 

A specific motion estimation algorithm that can be used in 
the present invention is the subject of a pending patent 
application, Ser. No. 275,859 filed on Nov. 25, 1988, and 
herein expressly incorporated by reference, and was 
invented by one of the inventors of the present invention. 
This motion estimation algorithm, hereinafter the "Fogel 
algorithm", is very well suited for filtering noisy image 
sequences because (i) it is extremely insensitive to noise, 
and (ii) it provides good estimates in the case of varying 
scene content Its robustness is due to the fact that it avoids 
direct differentiation of the image distribution. It imposes the 
optical flow and directional smoothness constraints adop- 
tively using a novel variational principle. This latter feature 
of the Fogel algorithm is responsible for its good perfor- 
mance in the case of covered/uncovered region and camera 
panning situations. 

ITie method of the present invention performs temporally 
adaptive LMMSE filtering along motion trajectories which 
are determined using a robust motion estimation algorithm. 
The noise is assumed to be white. However, it may be 
signal-independent or signal-dependent. The method of the 
present invention is superior to an implicit motion-detection 
method as well as other motion-compensated methods using 
pel-recursive motion estimation. Also, the method of the 
present invention is very effective in suppressing the film- 
grain noise, without smearing the image details, for image 
sequences digitized from motion picture film. 

Other objects, advantages and novel features of the 
present invention will become apparent from the following 
detailed description of the invention when considered in 
conjunction with the accompanying drawings. 



BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 shows a sequence of image frames to illustrate a 
first embodiment of the method of the present invention. 

HG. 2 shows a sequence of image frames to illustrate 
another embodiment of the method of the present invention. 

FIG. 3. illustrates a demonstration of the nature of c, (r,d a ) 
using a set of hypothetical G and § a values: Ci versus G and 
$ a for {u=6.0; p=0.4; and q=0.5}. 

FIG. 4(a) shows a slice of the plot in FIG. 3 at a certain 
G value that is compared to curves obtained by individually 
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varying the value of the parameter p: (u=6.0; p=0.4; and 
q=0.5) and (u=6.0; p=0.8; and o=0.5). 

FIG. 4(b) shows a slice of the plot in HG. 3 at a certain 
G value that is compared to curves obtained by individually 
varying the value of the parameter q: (u=6.0; p=0.4; and 
q=0.5) and (u=6.0; p=0.4; and q=1.0). 

FIG. 4(c) shows a slice of the plot in FIG. 3 at a certain 
G value compared to curves obtained by individually vary- 
ing the value of the parameter u: (u=6.0; p=0.4; and q=0.5) 
and (u=10.0; p=0.4; and q=0.5). 

FIG. 5 shows a method for filtering a noisy image 
sequence according to an embodiment of the present inven- 
tion. 

DETAILED DESCRIPTION OF PREFERRED 
EMBODIMENTS 

The present invention provides an algorithm for temporal 
filtering of noisy image sequences, which utilizes the esti- 
mate of the interframe motion. This motion-compensated 
temporal filtering approach makes use of interframe motion 
information. A motion estimation algorithm that has been 
adapted for use as the algorithm of the present invention is 
the Fogel algorithm, although other robust algorithms may 
be developed that would also be suitable. The Fogel algo- 
rithm is very well suited for noise filtering applications due 
to its robust nature. The basic mathematical principles of the 
Fogel algorithm is described in S. V. Fogel, 'Estimation of 
velocity vector fields from time-varying image sequences,' 
CVGIP: Image Understanding, Vol. 53, No. 3, May, 1991. 
For the purposes of the present discussion, the basic prin- 
ciples of the Fogel algorithm are presented, rather than its 
mathematical details. 

In the present invention, a noisy image sequence, g(x,y,t), 
is modeled by: 



Eq. CD 



where f(x,y,t) and v(x,y,t) denote the actual image and noise 
distributions, respectively, at continuous spatial and tempo- 
ral coordinates (x,y,t). A discrete model can be defined in 
terms of the discrete coordinates (nyi,k) as g(m,n,k)=f(m, 
n,k)+-v(m,n,k), where g(m,n,k), f(m,njc), v(m,n,k) denote 
the sampled versions of the corresponding quantities in 
Eq.(l). 

Assume that the frame at time k of a given image 
sequence is filtered using its N neighboring frames including 
itself. Without loss of generality, also assume that N=2M+1 
and that the frames at times k-M, . . . k-1, k, k+1 . . . , k+M 
are used to filter the frame at time k. (See FIG. 1 for an 
illustration of the frames for k=0 and M=2.) 

In the first step of the present invention, motion trajecto- 
ries that pass through the pixel locations at the kth frame are 
determined using the motion estimation algorithm. A motion 
trajectory is defined for each pixel in the kth frame. For a 
pixel located at (m,n) in the kth frame, the motion trajectory 
is defined as the set of locations containing (mji) as 
well as the displaced locations corresponding to (m,n) in the 
N-l neighboring frames at times k-M, . . . k-1, k+1, . . . , 
k+M. The displacement vectors are estimated by the motion 
estimation algorithm FIG. 1 shows the pixel located at 
(m,n)=a at frame k=0, and the corresponding (in the motion 
sense) locations b,c,d, and e in the neighboring four frames. 
In this case, the motion trajectory is the set T 0t o={a,b,c,d,e}. 
In general, the components of the displacement vector 
estimates are real-valued, and thus the trajectories often 
include subpixel locations at which the image values are 
determined via interpolation. 

At the second step of the method of the present invention, 
image values that lie along the motion trajectories are used 
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to determine the temporally adaptive LMMSE estimates 
(i.e., filtered values) of pixel values at the kth frame. This . 
two-step procedure is then repeated for the other frames of 
the sequence. 

In the following, the motion estimation step is discussed 5 
first in more detail, followed by a discussion of temporally 
adaptive filtering. 

Motion estimation algorithms can be classified into three 
major classes: (i) feature/region matching methods; (ii) 
Fourier transform domain methods; and (iii) spatiotemporal- 
gradient based methods. In the feature/region matching 10 
methods, motion is estimated by matching features (e.g., 
edges) or regions (e.g., block of pixels) from one frame to 
another. The displacement vectors are estimated only on a 
sparse set of image points. The Fourier transform methods 
exploit the relationship between the Fourier transforms of 
two-dimensional signals shifted with respect to each other. 
The spauotemporal-gradient based methods estimate the 
motion by using spatial and temporal changes (gradients) of 
the image distribution as well as the displacement vector 
field. The Fogel algorithm, used in certain embodiments of 
the present invention as the motion estimation algorithm, is 20 
a spatiotemporal gradient method. 

There are four major ideas of the Fogel algorithm: (1) 
formation of a multiresolution representation of the given 
image sequence, (2) the use of optical flow and directional 
smoothness constraints on the displacement vector field 
estimate, (3) statement of a novel variational principle used 
to obtain the displacement vector field estimate, and (4) 
solution of the variational principle. For the sake of discus- 
sion, the emphasis is on the concepts of the Fogel algorithm 
rather than its mathematical details. Further, for the sake of 
simplicity, a different notation is used than that contained in 30 
the Technical Report mentioned earlier. Also, the following 
discussion is limited to continuous spatial and temporal 
variables. 

A multiresolution representation of the given image 
sequence, denoted by g^foy.t), is determined by correlating 35 
it with a kernel of the form Eq. (2) 



A(*y) = 



Aexp 
0, 



otherwise. 



25 



J ,C a J I + D 2 y 2 <B J 40 



That is g^x^D^ing^Ti^hau-xVot, (Ti-yya))dud7y 
Here, Ci denotes the image support, A,B,C and D are 45 
predetermined constants, and a is the resolution parameter. 
One can see that the spatial resolution of g a (x,y,t) decreases 
as a increases. It is important to note that the spatial partial 
derivatives of g a can be computed conveniently by corre- 
lating g with the partial derivatives of the correlation kernel so 
which is infinitely difierentiable. Motion estimation is per- 
formed hierarchically, starting from the lowest resolution 
and ending at the original resolution of the image sequence. 

The Fogel algorithm imposes optical flow and directional 
smoothness constraints on the displacement vector field 55 
estimate. These two constraints are introduced in the fol- 
lowing. The estimate of the displacement vector field at 
location r, describing the motion between frames at t and 
t+At, at the resolution level a, is denoted as 



The displaced frame difference function can be used to 
introduce the optical flow constraint. The effects of noise, 
possible changes in the scene illumination, etc., are ignored 
for the moment and the points of a moving structure are 
assumed to attain the same gray level at adjacent frames. In 
that case, if the estimate d a were equal to the true displace- 
ment vector, the displaced frame difference function would 
attain the value of zero at that location. The optical flow 
constraint is said to be satisfied by the displacement vector 
when the displaced frame difference function attains the 
value of zero. Hence, optical flow constraint is imposed as 
the (absolute) value of the displaced frame function is 
decreased. 

It is important to note that the optical flow constraint is not 
sufficient to uniquely detenmne the displacement vector. In 
other words, the equation (j^r.d")^ has, in general, more 
than one solution. In particular, the optical flow constraint 
imposes a restriction only on the component of the displace- 
ment vector that is in the direction of the spatial gradient 
vector at location r. There are no restrictions imposed, 
however, on the component in the direction perpendicular to 
the gradient vector. The solution space for the displacement 
vector estimate, therefore, contains more than one element. 

The size of the solution for the displacement vector 
estimate can be reduced by using the multispectral charac- 
teristics of the image, if available. For instance, if the image 
sequence is available in color, optical flow constraint can be 
imposed in red, green and blue channels. Since the direction 
of the spatial gradient vector at each color channel is 
different in general, the displacement vector is effectively 
constrained in three different directions. This can be viewed 
as increasing the number of "knowns" while the number of 
unknowns remains the same. 

The solution space for the displacement vector estimate 
can be further reduced by using additional constraints, such 
as the smoothness constraint which imposes the requirement 
that the displacement vectors vary smoothly along certain 
spatial directions. In the following, the directional smooth- 
ness function £ a at resolution level a is defined to introduce 
the directional smoothness constraint. 

The directional smoothness function is defined in terms of 
the directional derivatives of the displacement vector. The 
directional derivatives of the x and y components of the 
displacement vector in the direction of the vector sKs^] 7 " 
are defined as Eq. (5) 



Vjd/Kr^-g^- dy a k+asx,y+as y ) 



^r.-r.A/H^/.A/) ^°(r;/,A/)f 



Eq. (3) 



60 



(3) 

where r=Ix y] r . Then, the displaced frame difference func- 
tion, <|> a , for these two frames, at the resolution level a, is 
defined as 



65 



The directional smoothness function is defined by 



.(6) 



Ek].(4) 



Smoothness constraint in the direction of s is imposed when 
the value of this function is small. 

It is important to realize that optical flow and smoothness 
constraints on the displacement vector field are not neces- 
sarily valid at all image points. For instance, optical flow 
constraint does not hold near occlusion boundaries where 
regions are covered or uncovered. Smoothness constraint, on 
the other hand, does not hold near occlusion boundaries in 
the direction perpendicular to occlusion boundaries, where 
the displacement field suddenly changes. 

Early spatiotemporal methods in the prior art used the 
optical flow and smoothness constraints globally on the 
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20 



entire image, resulting in algorithms that provided unsatis- 
factory displacement estimates in general. Later methods 
selectively impose the smoothness constraint. These meth- 
ods required the detection of object boundaries. The smooth- 
ness constraint was then imposed only along the boundaries 
but not perpendicular to the boundaries (without differenti- 
ating between the occlusion and non-occlusion boundaries; 
smoothness constraint does hold in the direction perpen- 
dicular to a non-occlusion boundary). The Fogel algorithm 
has the novel ability to impose both the optical flow and 
directional smoothness constraints adaptively in an appro- 
priate fashion (differentiating between occlusion and non- 
occlusion boundaries) within the framework of a new varia- 
tional principle, as is explained below. 

In order to appropriately impose the optical flow and 
directional smoothness constraints, one should minimize a 
weighted sum (or integral) of the displaced frame difference 
and directional smoothness functions defined above, where 
the weights are adjusted with respect to the validity of each 
constraint. The weight of the displaced frame difference 
function should be decreased at points where optical flow 
constraint is violated. Similarly, the weight of the directional 
smoothness function should be decreased in directions 25 
where the smoothness constraint is violated. Ideally, these 
weights should be functions of the displacement vector field 
since the displacement vector field is the best indicator for 
the validity of these constraints. For instance, a sudden 
change in the spatial gradient of the displacement vector 
signals the presence of an occlusion boundary. (This is 
explained in greater detail below when the explicit forms of 
these weights are discussed). With this in mind, the dis- 
placement field estimate should satisfy the following varia- 
tional principle: the estimate d a , at resolution level cc, should 
be such that the functional 



dPj)¥ds]dT Eq.00 40 



is minimized with respect to 5d a when The first term 
on the right hand side of Eq.(7) is a weighted integration of 
the square of the displaced frame difference function over 
the entire image. In the second term, the notation JjQ ds 
denotes the integration over a predetermined set, S, of 
directions (in the preferred embodiment of the algorithm, 
eight directions are considered); the weighted integral of the 
smoothness function is then integrated over the entire image. 

The weights, which depend on the displacement vector 
estimate, are given by Eq. (8) 



Eq. (9) 



30 



35 
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where 



and 



Bd x <Kr) 



dd x Hr) 
~5T" 



55 



60 



65 



denote the spatial gradients of the displacement vector, and 



where V, is the directional derivative operator defined in 
Eq.(5). The parameters p,q,u,v,w,z are heuristically deter- 
mined using a number of image sequences. Their values are 
robust in general and do not need to be adjusted from one 
image sequence to another. 

Hie following is a discussion of adaptive weighting of 
optical flow and directional smoothness constraints. The 
weight function c l (r 1 d a ) associated with the displaced frame 
difference function will be considered first. At any location 
r, the value of the weight function should be monotonically 
decreasing with increasing values of the displaced frame 
difference function, hence weakening the effect of the opti- 
cal flow constraint (at location r) on the solution of Eq.(7). 
This explains the presence of the term p 2 (((> a (r,d a )) 2 in the 
denominator of the weight function given in Eq.(8). The 
inclusion of the term q^lV^WlVlVdy^r)! 2 )^,^)) 2 in 
the denominator of Eq.(8) is primarily motivated by the fact 
that the weakening of the optical flow constraint should be 
the most significant near occlusion boundaries. Near occlu- 
sion boundaries, both the gradient term G(r, 
d^^dVd/WF+lVdy 11 ^)! 2 ) and the displaced frame differ- 
ence function (^fcd") attain high values due to sudden 
change in the displacement vector field and due to covering/ 
uncovering image structures, respectively. Hence, the com- 
pounding effect of the product of these two quantities 
significantly reduces the weight, resulting in significant 
weakening of the optical flow constraint. As a consequence, 
the weakening of the optical flow constraint due to occlusion 
is more severe than that due to intraframe differences created 
by noise and/or isolated artifacts (e.g., scratches on the film 
from which the image sequence is digitized). 

The purpose of the parameters p and q is to adjust the 
relative effects of the terms G(r,d a ) (§ a (r4 a )f and (^(r, 
d a )) 2 on the value of the weight function. The parameter u 
toermines the value of the weight function when the other 
two terms in the denominator attain small values, especially 
when the optical flow constraint is strongly imposed. 

More insight into the weight function can be gained by 
plotting its values against the displaced frame difference 
function and the displacement field gradient. FIG. 3 shows 
plots ^created by using an array of hypothetical Gfod") and 
<|> a (r,d a ) values using {u=6.0; p=0.4; and q=0.5}. One can 
observe in FIG. 3 that the value of Cxfco*™) decreases with 
increasing <t> a values, and the rate of the decrease increases 
with increasing G(r,d a ) values. The weight attains very 
small values when both of these quantities attain large 
values. A slice of the plot in FIG. 3 at a certain G(r,d a ) value 
(G(r,d a )=10) is compared in FIG. 4 to curves obtained by 
individually varying the values of the parameters p, q and u 
from that used in FIG. 3. The curves for (u=6.0; p=0.4; and 
q=0.5) and (u=6.0; p=0.8; and q=0.5) are compared in FIG. 
4(a). FIGS. 4(b) and 4(c) compare the curves for (u=6.0; 
p=0.4; and q=^).5) and (u=6.0; p=0.4; and q=1.0), and 
(u=6.0; p=0.4; and q=0.5) and (u=10.0; p=0.4 and q=0.5), 
respectively. 

The rate of the decrease of the weight function increases 
with increasing values of p and/or q. Larger values of u, in 
general, results in smaller values of the weight function (see 
FIG. 4(c)). It can be shown that the inflection point of the 
weight function occurs at 
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for a fixed value of G(r,d a ), The weight function c 2 (r,d a ) 
associated with the directional smoothness a constraint is 
identical in form to the weight function c^d**). The roles 
played by V^r) and £*(r,d a ,s) in c 2 (r,d°\s) are similar to 
those played by Gfcd") and (<t>°(r,d a )) 2 in c,(r,d a ), respec- 
tively. 

When the value of the directional smoothness function 
J; a (r,d a ,s) increases at location r, the value of the weight 
function decreases, hence weakening the smoothness con- 
straint at that location in the direction of s. Near occlusion 
boundaries, both £ a (r,d a ,s) and Vj^r) may attain high 
values when s is parallel to spatial gradient vector In that 
case, due to the product term z 2 (V f ^(r) 2 (e a (r,d a ,s)) 2 in the 
denominator of (9), the value of the weight significantly 
decreases, hence weakening the effect of the directional 
smoothness constraint on the solution of Eq.(7). The effect 
of the parameters v, w, and z on the value of c 2 is identical 
to the effects of the parameters u, p and q on the value of Cj. 
One can therefore refer to FIGS. 2 and 3 to gain an insight 
to the effects of these parameters on the weight function 
associated with the directional smoothness constraint. 

The following is a discussion of multispectral extension. 
The advantage of using the different spectral channels of a 
given image sequence in intraframe motion estimation was 
pointed out above. Assume that the red (r), green (g) and 
blue (b) records of the image sequence data are available. 
Taking into account all three channels, the cost function in 
Eq.(7) becomes Eq. (10) 

Wad 0 ) - [ci/r,io OfrAnrf 1 + + 

J Q £j c 1 {r,hj)$«(r,h + &f)) ,2 ds j dr 



where the weight functions are given by Eq. (11) 
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structure used in the present invention to perform temporal 
filtering has been originally derived by Kuan, et al. and Lee 
(discussed earlier) in the spatial domain for removing noise 
from still imagery. Kuan, et al. and Lee used different sets of 
assumptions in their derivations. The assumptions used by 
Kuan, et al. are considered for now because they are more 
appropriate for the present applications than are Lee. For 
instance, unlike the one given by Lee, the derivation pro- 
vided by Kuan, et al. allows for signal-dependent noise. This 
is important in the case of processing sequences digitized 
from a motion picture film, where the noise is signal- 
dependent due to the present of film grain. 

Kuan, et al. addresses the problem of finding an LMMSE 
estimate of the original signal from its noisy observation 
g(m,n)=f(m,n)+v(m,n) (i.e., the filtering problem) where the 
noise can be signal-dependent or signal-independent. The 
signal-dependent noise can in general be modeled as v(m, 
n^yCn^nJf^n^n) where tfm,n) is wide sense stationary with 
zero-mean and unity variance, and independent of the image 
signal. The parameter a is a real number; the film grain noise 
can be modeled with a=V3. It is assumed that (i) the image 
has a nonstationary mean, denoted by u/m,n), and the 
residual p(m,n)=f(ni,n)-u^rn,n) is a white process with 
nonstationary variance a/m,n) (this is referred to as the 
"nonstationary mean nonstationary variance image model' '); 
and (ii) the noise is a zero-mean white process. In Kuan et 
al., the nonstationary mean nonstationary variance model is 
experimentally verified in the case of real-life images using 
local statistics. It is observed that the residual signal appears 
like white noise, especially in uniform image regions. 

The LMMSE estimate f(m,n) has the following form: Eq. 
(13) 



Arrun) = 



35 



where <y v 2 (m,n) denotes the noise variance. If the ensemble 
statistics are replaced by local statistics (on the basis of local 
ergodicity assumption) computed from the observed image, 
we have the local LMMSE estimate: (Eq. (14) 



(J = r,g t b), and 
Eq. (12) 



45 



50 



The solution of the variational principle can be obtained 55 
via calculus of variations, which in this case provides the 
necessary condition that the solution should satisfy: 



= 0 



These conditions give rise to a nonlinear equation which is 
then solved at each resolution level using a Quasi-Newton 
method. The estimate obtained at one resolution level is used 
as the initial estimate for the next higher resolution level 
The temporally adaptive LMMSE filtering performed by 
the present invention will now be described. The filter 



(The "6" notation denote the local statistics.) If we denote 
the sample variance and sample mean of g(m 2 n) computed 
over a local region centered at (m,n) by a s 2 ^m,n) and 
jl^(m,n), respectively, ^ then a/(nvn}=^max[0,cy g 2 (m,n)- 
a v 2 (m,n)] and (l r (m,n^p s (m,n), where c v 2 (m,n) denotes the 
sample variance of the noise process. Note that due to the 
nonHnearity inherent to computing sample variances, the 
local LMMSE estimate becomes a nonlinear estimate. 

In the method of the present invention, the spatially local 
LMMSE filter structure of Eq. (14) is used in the temporal 
domain along motion trajectories. Replacing the spatial local 
statistics with their temporal counterparts, the temporally 
local LMMSE estimate of the image value at location (nui) 
of the frame at time k is given by Eq. (15) 



60 Am.n.*) = 



q/(w.n,-jk) 



of(m,n;k) + <j v 2 (fnt#i;Jfc) 



fi5 where ay-^^y^maxfO^a/Cm^ikJ-o-^^n*)]. The 
quantities a fi 2 (m,n;k) and ^m,n;k)=u 5 (m l n;k) denote the 
sample variance and mean of the degraded image sequence, 
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respectively, computed over the motion trajectory x m nk . 
This applies for the case of filtering the pixel at location 
(m,n)=a in the k=0 th frame. The noisy pixel value is denoted 
by g(a,0). In FIG. 6, the trajectory of interest is T a 6={a,b, 
c,d,e}. The variance o/a^O) and the mean u/a,0) are .esti- 5 
mated from the sample mean and sample variance of the 
values g(c,-2), g(b-l), g(a,0), g(d,l), g(e,2). The estimate of 
g(a,0) is denoted by f(a,0). Note that the "white residual 
assumption" is appropriate along motion trajectories on 
which the signal statistics resemble the signal statistics over 10 
a uniform spatial region, especially at points where a good 
estimate of the displacement field is obtained. 

One should observe that the LMMSE estimator of the 
present invention defined in Eq.(15) satisfies the adaptivity 
requirement outlined earlier. If the displacement estimates 15 
used to determine the motion trajectory are accurate, then 
the temporal signal variance decreases and the estimation 
approaches direct averaging. As the accuracy of displace- 
ment estimation decreases, the estimator deviates from 
direct averaging, and in the extreme when the temporal 20 
signal variance is much larger than the noise variance, the 
estimate approaches to the noisy value. 

In practice, the noise variance may not be known a priori. 
In that case, an estimate of the noise variance can be 
obtained from the sample .variance of pixel values at a 25 
uniform region of the kth frame. This assumes that the noise 
is locally ergodic in both space and time and have identical 
second order statistics. 

Two different embodiments of the method of the present 
invention are illustrated by FIGS. 1 and 2, where N=5 and 30 
frames at k=^-2,-l ,0,1,2 are used to filter the frame at k=0, 
denoted by f(m,n,0). In the embodiment of the method 
depicted in FIG. 1, the displacement vectors from frame 
f(m,n,0) to the others are estimated. The pixel at location a 
in f(m,n,0) is displaced to (possibly) subpixel locations 35 
c,b,d,e in f(m,n,-2), f(m,n -1), f(m,n,l), f(m,n,2), respec- 
tively. For the pixel located at a, the trajectory x <J ={a,b,c,d,e} 
is used in LMMSE filtering. Clearly, if N is large and there 
is significant interframe motion (and hence occlusion), the 
precision of motion estimation is likely to decrease in this 40 
embodiment of the method. In the ideal case of perfect 
displacement estimation and in the presence of signal- 
independent noise, this method reduces the noise variance 
by a factor of N. 

Another embodiment of the present invention, depicted in 45 
FIG. 2, is more precise as far as the estimation of motion 
trajectories is concerned. In this case, the displacement 
vectors from one frame to its two neighboring frames are 
estimated. The filtering is done in M=(N-l)/2^2 steps. In the 
first step, the frames [f(m,n,-2), f(nm,-l), f(m,n,0)]; [f(m, 50 
n,-l), f(m,n,0), f(m,n,l)]; and [f(m,n,0), f(m,n,l), f(m,n,2)] 
are filtered using trajectories T fc ={c,b,a"}, T 0 ={b',a,c'} and 
VHa',c,d}, respectively, to produce the three intermediate 
frames. In the last step, these three frames are filtered using 
the trajectory V={b\a,c'} to determine the estimate rXm^.O). 55 

In the second described embodiment, filtering of a certain 
frame using N=2M+1 frames is performed in M steps, and 
the LMMSE estimation is applied to groups of three frames. 
The process can be efliciently carried out for the other 
frames of the sequence. For instance, in filtering f(m,n,l), 60 
only one LMMSE estimation needs to be performed at the 
first step since the other two intermediate frames required at 
the second step have already been formed during filtering 
the previous frame f(m,n,0). 

It can be verified that, in the ideal case of perfect 65 
displacement estimation, where b=b', a=a'=a" and c=c\ and 
in the presence of signal-independent ncdse, the second " 



embodiment of the present invention reduces the noise 
variance by a factor of F=3(M+l) 3 /2M(M+2)+3 (e.g., F=8 1/ 
19=4.3 for N=5). Finally, note that these two implementa- 
tions are identical for N=3. 

Modifications of the method of the present invention are 
possible without departure from the scope and spirit of the 
invention. Methods other than the above-discussed LMMSE 
filtering, such as median filtering, can be used along the 
motion trajectories to perform temporal noise filtering. In 
fact, almost all of the known "edge-preserving" spatial 
filtering techniques can be extended by one of ordinary skill 
in the art to motion-compensated temporal filtering. (The 
specifics of edge-preserving spatial filtering techniques will 
not be discussed in detail here, as they are well known.) The 
edge-preserving property of a spatial filter enables the 
corresponding motion-compensated temporal filter to meet 
the adaptivity requirement stated earlier, i.e., that less 
smoothing should be performed along motion trajectories 
when the displacement vector estimate is inaccurate in order 
to prevent smearing artifacts to occur. 

The LMMSE filter structure of the present invention 
given in Eq.(15) can be modified to improve noise reduction 
in cases where the scene content changes abruptly from one 
frame to another. Consider for instance the implementation 
given in FIG. 2. In the first step, the filter structure is applied 
to frames f(m,n -2), f(m,n,-l) and f(m,n,0) to produce an 
intermediate frame. Suppose that the frames f(m,n,-2), 
f(m,n -1) contain an indoor scene, but the following frames 
contain an outdoor scene which is substantially different 
than the indoor scene. In this case, the displacement field 
estimate from f(m,n,-l) to f(m,n,-2) is expected to be more 
accurate than the estimate from f(m,n-l) to f(m,n,0). There- 
fore, it may be advantageous to replace the sample mean 
in Eq.(15) by a weighted sample mean which puts more 
emphasis on image values at locations of frames at times 
k=-2 and k=-l whose displacement estimates are more 
accurate, and less emphasis to the corresponding image 
value of the frame at time k=0. Generally speaking, the 
weighted sample mean should take into account the accu- 
racy of intraframe motion estimation by assigning larger 
weights to image values whose displacement estimates are 
more accurate. (It should be noted that the filter structure 
obtained by replacing by a weighted mean is indeed not 
optimum in the minimum mean square error sense.) 

A "motion adaptive" weighted sample mean can be 
defined mathematically as follows: Suppose, for simplicity, 
that M=l, i.e., the frames at times k-1, k, and k+1 are used 
to filter the frame at time k. Then, the weighted sample mean 
along the motion trajectory passing through the pixel 
location (nvn) in the kth frame, denoted by ^r a (m,n;k), is the 
minimizer of the functional 
Eq. (16) 



/(X) 



Jbfl 
£ 

s 2=k-l 

(y)w 



1 +p2(nux[«2 i8im,n.k)-g(i,j,m) 



at X=£y a (m,n;k). Quantities a (a>0) and e are fixed param- 
eters; we shall describe later how they efiFect the weighted 
mean value. Differentiating I(X) with respect to X and 
setting the result equal to zero, we obtain Eq. (17) 



\if(m,n;k) 



where the weight function is given by 
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wtoO-JWmnh 2 . (gto.n.k>-g<ij.r>frr 
and the constant K is defined as 



14 



Eq. (18) 
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l>^y,fc)P(y,*+l)] Eq. (19) 

The quantity p(ij,l) denotes the denominator term in Eq. 
(16), i e.. p(ij,l)=l+a 2 (max[e 2 , (g(m,n,k>-g(Lj,l)) 2 ]). 

The resulting weighted mean will now be analyzed, with 
several remarks being in order. The weight associated with 
the value g(m,n,k) of the cepter frame is fixed: w(m,n,k)= 
K/(l+p 2 e 2 ). Hence the contribution of the central value 
g(m,n,k) to the weighted mean is always the same. If the 
difference between the corresponding values g(m,njc) and 
g(i j,k-l) (or g(in,n,k) and g(i j,k+l)) is large (larger than e 2 ) 
then the contribution of g(ij,k-l) (or g(ij,k+l)) to the 
weighted mean is small The parameter a determines the 
sensitivity of the weight value to the difference terms 
(g(m,n,k)-g(i j,k-l)) 2 and (g(m,n,k)-g(i j,k+l)) 2 . When the 
differences between the corresponding values are merely 
due to the noise, it is desirable that the weighted averaging 
reduces to direct averaging It is therefore plausible to set e 2 
equal to twice the estimate noise variance. In that case, when 
(g(m,njc)-g(ij,k-l)) 2 and (g(m,n k)-g(ij k+1)) 2 are both 
less than e , the weights attain the value A K/(l+p 2 e 2 )^ and 
^/(nvijk) reduces to direct averaging, i e., /^(m.nik^ji^m, 
n;k). 

In an embodiment of the present invention, the motion- 
adaptive mean value is used as the filtered pixel value in 
filtering the image sequences. In other words, the filter is 
defined as f(m,njc)=/iym,n,k). 30 

The various embodiments of the method of the present 
invention as described above can be implemented using 
conventional physical image processing equipment, such as 
a camera, an image digitizer, a computer, and a display, and 
as such, are not shown in detail in order not to obscure the 35 
present invention 

Although the invention has been described and illustrated 
in detail, it is to be clearly understood that the same is by 
way of illustration and example, and is not to be taken by 
way of limitation The spirit and scope of the present 40 
invention are to be limited only by the terms of the appended 
claims 

What is claimed is: 

1. A method of temporally adaptive filtering frames of an 
image sequence in which each image of the sequence has 
pixel locations, comprising: 

(a) subdividing an image sequence into a plurality of 
frames; 

(b) determining motion trajectories of pixels that pass 
through the pixel locations at a kth frame using a 
motion estimation algorithm; 

(c) tetenrrining image values for a pixel originally located 
at a specific pixel location at the kth frame along the 
motion trajector of that pixel to the same or other pixel 
locations in other frames of the image sequence, and 55 
repeating for all pixel locations at the kth frame; 

(d) determining temporally adaptive linear minim um 
mean square error (LMMSE) estimates of the image 
value at each of the pixel locations at the kth frame 
based upon the determined image values; 

(e) repeating steps b, c and d for each of the other frames 
of the image sequence; 

(f) modifying the image values of the pixels at the pixel 
locations at each frame of the image sequence based 
upon the determined temporally adaptive LMMSE esti- 
mates; and 
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(g) generating a temporally adaptive filtered image 
sequence. 

2. The method of claim 1, wherein, for a pixel located at 
(m,n) in the kth frame, the motion trajectory t m/ltk is defined 
as a set of locations containing (m,n) as well as displaced 
locations corresponding to (m,n) in N-l neighboring frames 
at times k-M, . . . k-l, k+1, . . . , k+M, where N=2M+1. . 

3. The method of claim 1, wherein step (b) further 
comprises estimating displacement vectors with the motion 
estimation algorithm. 

4. The method of claim 3, wherein components of the 
displacement vector estimates are real-valued, such that the 
motion trajectories include subpixel locations, and wherein 
step (c) further comprises (f) determining image values at 
the subpixel locations by interpolation. 

5. The method of claim 1, wherein the motion estimation 
algorithm is a spatiotemporal-gradient method. 

6. The method of claim 5, wherein the spatiotemporal- 
gradient method is the Fogel algorithm. 

7. The method of claim 1, wherein step (b) includes the 
steps of: 

(b) (1) forming a multiresolution representation of the 
image sequence; 

(b) (2) constraining a displacement vector field estimate 
using optical flow and directional smoothness con- 
straints; 

(b) (3) obtaining a displacement vector field estimate 

using a variational principle; and 
(b) (4) solving the variational principle. 

8. The method of claim 1, wherein the temporally local 
LMMSE estimate of the image value at pixel location (nyi) 
of the kth frame is determined according to the equation: 

j(m,n,A) = 



ig{m,n,k) - vj.m,n;k)] + p£m,n;k). 



where a/(ni 1 n;k)=Hnax[0 J a/(m,n;k)--a v 2 (in,n;k)], and 
a/(m,n,;k) is the sample variance, and ^n^mk^jl^m^^) 
is the sample mean of a degraded image sequence computed 
over the motion trajectory t m nJb and £ v 2 (m,n) denotes the 
sample variance of a noise process. 

9. Hie method of claim 1, wherein step (b) includes 
estimating the motion trajectories from the kth frame to a 
predetermined number of contiguous frames in the image 
sequence. 

10. The method of claim 9, wherein the predetermined 
number of contiguous frames are used to filter the kth frame. 

11. The method of claim 1, wherein step (b) includes 
estimating the motion trajectories from the kth frame to a 
k-l frame and a k+1 frame. 

12. The method of claim 11, wherein the k-l, k and k+1 
frames are filtered to produce three intermediate frames, 
these intermediate frames then being filtered to determine an 
estimate f(m,rU0 of the image value in step (d). 

13. The method of claim 1, wherein the kth frame is 
filtered using N=2M+1 frames, the filtering being performed 
in M steps, and the LMMSE estimation is applied to groups 
of three frames where N is a predeterxnined number of 
frames. 

14. The method of claim 1, wherein the temporally local 
LMMSE estimate of the image value at pixel location (m,n) 
of the kth frame is toermined according to the equation: 
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<J?(m.n;k) 

; [g(m t n,k) - }if(rrv\;k)] + ^(/i^n/fe). 

cr/(m. n.Jt) + Cv^m. n.Jt) 5 

where a/(in,n;k)=rnax[0,a^ 2 (in^i;k)-a v 3 (ni,n;k)], and 
aj 2 (m,n;k) is the sample variance, and \x a /mji'Xy= 



10 



k+l 



is a motion-adaptive sample mean where the weight function 
is given by 15 

and the constant K is defined as 



the quantity p being given by, 

P(y,D*l*i 2 (max[£ 2 , (g(m,n > AH0',/>0) 2 )). 
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15. The method of claim 1, wherein the temporally local 
LMMSE estimate of the image value at pixel location (m,n) 
of the kth frame is determined according to the equation 
f(m>n,k)=£ a y(m,Ji,AJ where ff/mjiX) is a motion-adaptive 30 
sample mean value and f(m,n,k) is a filtered pixel value. 
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16. A method of temporally adaptively filtering frames of 
an image sequence in which each image of the sequence has 
pixel locations, comprising: 

(a) subdividing an image sequence into a plurality of 
frames; 

(b) determining motion trajectories of pixels that pass 
through the pixel locations at a kth frame using a 
motion estimation algorithm; 

(c) determining image values for a pixel originally located 
at a specific pixel location at the kth frame along the 
motion trajectory of that pixel to the same or other pixel 
Locations in other frames of the image sequence, and 
repeating for all pixel locations at the kth frame; 

(d) determining temporally adaptive estimates of image 
value at each of the pixel locations at the kth frame 
based upon the determined image values; 

(e) repeating steps b, c and d for each of the other frames 
of the image sequence; 

(f) modifying the image values of the pixels at the pixel 
locations at each frame of the image sequence based 
upon the determined temporally adaptive LMMSE esti- 
mates; and 

(g) generating a temporally adaptive filtered image 
sequence. 

17. The method of claim 16, wherein step (d) includes 
median filtering along the motion trajectories to perform 
temporal noise filtering. 

18. The method of claim 16, wherein step (d) includes 
edge-preserving spatial filtering. 
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